Evaluation of Integrated Error Processing and Fault Diagnosis in Multiprocessor Systems
Authors
Abstract
This paper deals with multiprocessor systems that must provide both high performance and strong dependability. Fault tolerance is pursued by combining and integrating a diagnostic mechanism, called α-count, with simple instances of redundancy-based error processing structures. The resulting fault tolerance strategies are then evaluated through stochastic simulation, an approach dictated by the high interdependence of the selected mechanisms. The analysis is carried out mainly in terms of performability, which proves to be an appropriate measure for judging whether one design is "better" than another from both the dependability and the performance points of view.
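The abstract names the α-count mechanism without defining it. As a rough illustration only, the sketch below follows one commonly described threshold-based formulation: a per-component score grows by one on each error signal, decays by a factor K on each error-free judgment, and flags the component once it reaches a threshold. The class and function names, the decay value 0.5, and the threshold 3.0 are hypothetical choices for this example, not parameters taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class AlphaCount:
    """Threshold-based error counter for a single component (illustrative)."""

    decay: float = 0.5      # hypothetical K in [0, 1], applied on error-free judgments
    threshold: float = 3.0  # hypothetical alpha_T, the diagnosis threshold
    score: float = 0.0      # current alpha value, starts at zero

    def update(self, error_observed: bool) -> bool:
        """Feed one judgment; return True once the component is flagged."""
        if error_observed:
            self.score += 1.0         # each error signal adds one
        else:
            self.score *= self.decay  # good behaviour gradually forgives the past
        return self.score >= self.threshold


def first_flagged_step(signals, decay=0.5, threshold=3.0):
    """Return the index of the judgment at which the threshold is first reached,
    or None if the component is never flagged."""
    counter = AlphaCount(decay=decay, threshold=threshold)
    for step, error_observed in enumerate(signals):
        if counter.update(error_observed):
            return step
    return None


if __name__ == "__main__":
    sparse = [True, False, False, False] * 3  # isolated, transient-looking errors
    burst = [True, True, False, True, True]   # errors recurring close together
    print(first_flagged_step(sparse))
    print(first_flagged_step(burst))
```

Under these assumed parameters, widely spaced errors decay away before they can accumulate, while errors that recur in quick succession push the score over the threshold. That is the intuition behind using such a counter to separate transient faults from intermittent or permanent ones.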
Similar resources
An approach to fault detection and correction in design of systems using of Turbo codes
We present an approach to the design of fault-tolerant computing systems. In this paper, a technique is employed that enables the combination of several codes, in order to obtain flexibility in the design of error-correcting codes. Code-combining techniques are very effective, and turbo codes are one example of such codes. The Algorithm-based fault tolerance techniques that to detect errors rely on the c...
Efficient Fault Tolerance: An Approach to Deal with Transient Faults in Multiprocessor Architectures
Dynamic error processing approaches are an important mechanism to increase the reliability in a multiprocessor system, while making efficient use of the available resources. To this end, dynamic error processing must be integrated with a fault treatment approach aiming at optimising resource utilisation. In this paper we propose a diagnosis approach that, accounting for transient faults, tries ...
Proposing an Efficient Software-based Method to Enhance Reliability of Computer Systems against Soft Errors
In recent years, along with rapid developments in technology, computer systems have increasingly become more integrated and more modular. Indeed, the reliability and efficiency of computer systems are of high significance. Hence, the quantitative evaluation of the optimization of reliability indexes in computer systems is considered to be a crucial issue. Reliability enhancement of computer systems...
Evaluation of Fault-Tolerant Multiprocessor Systems for High Assurance Applications
In designing high assurance systems, the dependability goals are achieved through the adoption of several fault tolerance techniques. Unfortunately, their combined effect on the system cannot be, in the general case, derived by straightforward composition of the stand-alone component's analysis, because of mutual dependence of their controlling parameters. In this paper the assessment of overal...
Deterministic execution of multithreaded applications for reliability of multicore systems
With the advent of modern nano-scale technology, it has become possible to implement multiple processing cores on a single die. The shrinking transistor sizes however have made reliability a concern for such systems as smaller transistors are more prone to permanent as well as transient faults. To reduce the probability of failures of such systems, online fault tolerance techniques can be appli...
Publication date: 2000